ADAPTIVE FUZZY SYSTEM 
FOR 3-D VISION 



N 93-22216 


FINAL REPORT 
ON 

NASA/JSC CONTRACT 
# NAG 9-509 

CONTRACT PERIOD: 4-01-91 TO 3-31-92 


SUBMITTED BY 

Dr. Sunanda Mitra 
Principal Investigator and 
Associate Professor, 
Department of Electrical Engineering,
Texas Tech University 
Lubbock, Texas 79409-3102 

To 

Dr. Robert Lea 

Fuzzy Logic Technical Coordinator, 
Information System Directorate, 
NASA-Johnson Space Center PT4
Houston, Texas 77058 

April 15, 1992





PROJECT SUMMARY 


A novel adaptive fuzzy system using the concept of the Adaptive Resonance Theory (ART) type 
neural network architecture and incorporating fuzzy c-means (FCM) system equations for reclassification 
of cluster centers has been developed. 

The Adaptive Fuzzy Leader Clustering (AFLC) architecture is a hybrid neural-fuzzy system which 
learns on-line in a stable and efficient manner. The system uses a control structure similar to that found 
in the Adaptive Resonance Theory (ART-1) network to identify the cluster centers initially. The initial 
classification of an input takes place in a two stage process; a simple competitive stage and a distance 
metric comparison stage. The cluster prototypes are then incrementally updated by relocating the centroid 
positions from Fuzzy c-Means (FCM) system equations for the centroids and the membership values.
The operational characteristics of AFLC and the critical parameters involved in its operation are 
discussed. The performance of the AFLC algorithm is presented through application of the algorithm to 
the Anderson Iris data, and laser-luminescent fingerprint image data. The AFLC algorithm successfully 
classifies features extracted from real data, discrete or continuous, indicating the potential strength of 
this new clustering algorithm in analyzing complex data sets. 

This hybrid neuro-fuzzy AFLC algorithm will enhance analysis of a number of difficult recognition
and control problems involving Tethered Satellite Systems and the on-orbit space shuttle attitude
controller.

I. INTRODUCTION


Cluster analysis has been a significant research area in pattern recognition for a number of years[1]-[4].
Since clustering techniques are applied to the unsupervised classification of pattern features, a neural
network of the Adaptive Resonance Theory (ART) type[5],[6] appears to be an appropriate candidate for
implementation of clustering algorithms[7]-[10]. Clustering algorithms generally operate by optimizing
some measure of similarity. Classical, or crisp, clustering algorithms such as ISODATA[11] partition
the data such that each sample is assigned to one and only one cluster. Often with data analysis it is
desirable to allow membership of a data sample in more than one class, and also to have a degree of
belief that the sample belongs to each class. The application of fuzzy set theory[12] to classical
clustering algorithms has resulted in a number of algorithms[13]-[16] with improved performance since
unequivocal membership assignment is avoided. However, estimating the optimum number of clusters in
any real data set still remains a difficult problem[17].

It is anticipated, however, that a valid fuzzy cluster measure implemented in an unsupervised neural 
network architecture could provide solutions to various real data clustering problems. The present work 
describes an unsupervised neural network architecture[18],[19] developed from the concept of ART-1[5]
while including a relocation of the cluster centers from FCM system equations for the centroid and the
membership values[2]. Our AFLC system differs from other fuzzy ART-type clustering algorithms[20],[21],
which incorporate fuzzy min-max learning rules. The AFLC presents a new approach to
unsupervised clustering, and has been shown to correctly classify a number of data sets including the Iris
data. This fuzzy modification of an ART-1 type neural network, i.e. the AFLC system, allows
classification of discrete or analog patterns without a priori knowledge of the number of clusters in a data 
set. The optimal number of clusters in many real data sets is, however, still dependent on the validity of 
the cluster measure, crisp or fuzzy, employed for a particular data set. 

II. ADAPTIVE FUZZY LEADER CLUSTERING SYSTEM AND ALGORITHM


A. AFLC System and Algorithm Overview 

AFLC is a hybrid neural-fuzzy system which can be used to learn cluster structure embedded in 
complex data sets, in a self-organizing, stable manner. This system has been adapted from the concepts of 
ART-1 structure which is limited to binary input vectors[5]. Pattern classification in ART-1 is achieved
by assigning a prototype vector to each cluster that is incrementally updated[10].

Let X_j = {x_j1, x_j2, ..., x_jp} be the jth input vector for 1 ≤ j ≤ N, where N is the total number of
samples in the data set and p is the dimension of the input vectors. The initialization and updating
procedures in ART-1 involve similarity measures between the bottom-up weights (b_ki where k = 1,2,...,p)
and the input vector (X_j), and a verification of X_j belonging to the ith cluster by matching of the
top-down weights (t_ik) with X_j. For continuous-valued features, the above procedure is changed as in
ART-2[6]. However, if the ART-type networks are not made to represent biological networks, then greater
flexibility is allowed in the choice of similarity metric. A choice of Euclidean metric is made in
developing the AFLC system while keeping a simple control structure adapted from ART-1.

Figure 1 

Figures 1(a) and 1(b) represent the AFLC system and operation for initialization and comparison of
cluster prototypes from input feature vectors, which may be discrete or analog. The updating procedure in
the AFLC system involves relocation of the cluster prototypes by incremental updating of the centroids
v_i (the cluster prototypes) from FCM system equations[2] for v_i and μ_ij as given below:

    v_i = \frac{1}{\sum_{j=1}^{N_i} (\mu_{ij})^m} \sum_{j=1}^{N_i} (\mu_{ij})^m X_j ; \quad 1 \le i \le C    (1)

    \mu_{ij} = \frac{[1 / d^2(X_j, v_i)]^{1/(m-1)}}{\sum_{k=1}^{C} [1 / d^2(X_j, v_k)]^{1/(m-1)}} ; \quad 1 \le i \le C; \; 1 \le j \le N    (2)

where N_i is the number of samples in cluster i and C is the number of clusters. The v_i's and μ_ij's are
recomputed over the entire data sample N.
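
As a minimal sketch of equations (1) and (2), the two FCM updates can be written in a few lines of Python. The function names and the small `eps` guard against a zero distance are assumptions made for illustration and do not come from the report.

```python
def sq_dist(x, v):
    """Squared Euclidean distance d^2(x, v) between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def memberships(x, centroids, m=2.0, eps=1e-12):
    """Equation (2): fuzzy membership of sample x in every cluster."""
    # eps is an assumed guard for the case where x coincides with a centroid
    inv = [(1.0 / max(sq_dist(x, v), eps)) ** (1.0 / (m - 1.0)) for v in centroids]
    total = sum(inv)
    return [w / total for w in inv]


def centroid(samples, mu, m=2.0):
    """Equation (1): membership-weighted mean of the samples of one cluster."""
    weights = [u ** m for u in mu]
    norm = sum(weights)
    dim = len(samples[0])
    return [sum(w * s[k] for w, s in zip(weights, samples)) / norm for k in range(dim)]


# A sample at squared distance 1 from one centroid and 9 from the other
mu = memberships([1.0, 0.0], [[0.0, 0.0], [4.0, 0.0]])
# Two samples with memberships 1.0 and 0.5 in the same cluster
v = centroid([[0.0, 0.0], [2.0, 0.0]], [1.0, 0.5])
print(mu, v)
```

With m = 2 the memberships are inversely proportional to the squared distances, so the nearer centroid receives the larger share and the memberships of one sample sum to one.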

As described here, AFLC is primarily used as a classifier of feature vectors employing an on-line 
learning scheme. Figure 1(a) shows a p-dimensional discrete or analog-valued input feature vector, X to 
the AFLC system. The system is made up of the comparison layer, the recognition layer, and the 
surrounding control logic. The AFLC algorithm initially starts with the number of clusters (C) set to zero. 
The system is initialized with the input of the first feature vector X. Similar to leader clustering, this first 
input is said to be the prototype for the first cluster. The normalized input feature vector is then applied to 
the bottom-up weights in a simple competitive learning scheme, or dot product. The node that receives 
the largest input activation Y is chosen as the prototype vector as is done in the original ART-1. 

    Y_i = \max_i \left\{ \sum_{k=1}^{p} x_{jk} b_{ki} \right\} ; \quad 1 \le j \le N    (3)
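
The first-stage competition can be sketched as follows. This is an illustration only: it assumes the bottom-up weights are the unit-length cluster prototypes, and the names `unit` and `first_stage_winner` are hypothetical.

```python
import math


def unit(v):
    """Scale v to unit length (v is assumed not to be the zero vector)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def first_stage_winner(x, weights):
    """Return the index of the node with the largest dot-product activation."""
    xn = unit(x)  # the input is normalized before the competition
    activations = [sum(a * b for a, b in zip(xn, w)) for w in weights]
    best = max(range(len(weights)), key=lambda i: activations[i])
    return best, activations


prototypes = [[4.0, 1.0], [1.0, 3.0]]
weights = [unit(v) for v in prototypes]  # assumed form of the bottom-up weights
winner, acts = first_stage_winner([2.0, 5.0], weights)
print(winner, acts)
```

The input (2, 5) points in nearly the same direction as the second prototype, so that node receives the larger activation and wins the competition.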

Therefore the recognition layer serves to initially classify an input. This first stage classification
activates the prototype or top-down expectation (t_ik) for a cluster, which is forwarded to the comparison
layer. The comparison layer serves both as a fan-out site for the inputs and as the location of the
comparison between the top-down expectation and the input. The control logic with an input enable
command allows the comparison layer to accept a new input as long as a comparison operation is not
currently being processed. The control logic with a compare imperative command disables the acceptance
of new input and initiates comparison between the cluster prototype of Y_i, i.e., the centroid v_i, and the
current input vector, using equation (4). The reset signal is activated when a mismatch of the first and
second input vectors occurs according to the criterion of a distance ratio threshold as expressed by
equation (4)



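    R = \frac{d(X_j, v_i)}{\frac{1}{N_i} \sum_{k=1}^{N_i} d(X_k, v_i)} < \tau    (4)

where k = 1, ..., N_i indexes the samples in class i and d(X_j, v_i) = \sqrt{d^2(X_j, v_i)} is the
Euclidean distance, with d^2 as indicated in equation (5).

    d^2(X_j, v_i) = \| X_j - v_i \|^2    (5)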

If the ratio R is less than a user-specified threshold τ, then the input is found to belong to the cluster
originally activated by the simple competition. The choice of the value of τ is critical and is found by a
number of initial runs. Preliminary runs with τ varying over a range of values yield a good estimate of the
possible number of clusters in unlabeled data sets.
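
The threshold test can be sketched as below. The report does not say how the ratio is evaluated for a cluster whose only sample coincides with its centroid, so this sketch assumes that the candidate input is included in the average distance; that convention, like the function names, is an assumption.

```python
import math


def euclid(x, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, v)))


def distance_ratio(x, centroid, members):
    """Ratio R of the candidate's distance to the cluster's average distance."""
    d_x = euclid(x, centroid)
    # Assumption: the candidate is counted in the average so that R stays
    # defined for a single-sample cluster.
    dists = [euclid(s, centroid) for s in members] + [d_x]
    avg = sum(dists) / len(dists)
    return d_x / avg if avg > 0.0 else 0.0


members = [[0.0, 0.0], [2.0, 0.0]]
r = distance_ratio([4.0, 0.0], [1.0, 0.0], members)
print(r, r < 2.0)  # the candidate is accepted when R is below the threshold
```

Under this assumed convention the ratio for a cluster of n samples cannot exceed n + 1, which should be kept in mind when comparing threshold values against those quoted in the report.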

When an input is classified as belonging to an existing cluster, it is necessary to update the 
expectation (prototype) and the bottom-up weights associated with that cluster. First, the degree of 
membership of X to the winning cluster is calculated. This degree of membership, μ, gives an indication,
based on the current state of the system, of how heavily X should be weighted in the recalculation of the
class expectation. The cluster prototype is then recalculated as a weighted average of all the elements
within the cluster. The update rules are as follows: the membership value μ_ij of the current input sample
X_j in the winning class i is calculated using equation (2), and then the new cluster centroid for cluster i is
generated using equation (1). As with the FCM, m is a parameter which defines the fuzziness of the 
results and is normally set to be between 1.5 and 30. For the following applications, m was 
experimentally set to 2. 

The AFLC algorithm can be summarized by the following steps : 

1. Start with no cluster prototypes, C = 0. 

2. Let X_j be the next input vector.

3. Find the first stage winner Y_i as the cluster prototype with the maximum dot product.

4. If Y_i does not satisfy the distance ratio criterion, then create a new cluster and make its
prototype vector equal to X_j. Output the index of the new cluster.

5. Otherwise, update the winner cluster prototype Y_i by calculating the new centroid and
membership values using equations (1) and (2). Output the index of Y_i. Go to Step 2.

A flow chart of the algorithm is shown in Figure 2. 

Figure 2 
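
The five steps can be put together in a short Python sketch. It is not the authors' implementation. It assumes that the bottom-up weights are unit-length centroids, that the candidate is included in the average distance of the ratio test, that each stored sample keeps the membership it received on entry, and that the search tries the prototypes in order of decreasing activation. The toy data are made up.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def unit(v):
    """Scale v to unit length (v is assumed not to be the zero vector)."""
    norm = math.sqrt(sum(p * p for p in v))
    return [p / norm for p in v]


def distance_ratio(x, cluster):
    """Ratio test with the candidate included in the average (assumption)."""
    d_x = euclid(x, cluster["centroid"])
    dists = [euclid(s, cluster["centroid"]) for s in cluster["samples"]] + [d_x]
    avg = sum(dists) / len(dists)
    return d_x / avg if avg > 0.0 else 0.0


def membership(x, winner, clusters, m):
    """Equation (2): membership of x in the winning cluster."""
    eps = 1e-12  # assumed guard against a zero distance
    inv = [(1.0 / max(euclid(x, c["centroid"]) ** 2, eps)) ** (1.0 / (m - 1.0))
           for c in clusters]
    return inv[winner] / sum(inv)


def aflc(data, tau, m=2.0):
    clusters, labels = [], []                       # Step 1: C = 0
    for x in data:                                  # Step 2: next input
        xn = unit(x)
        # Step 3: rank prototypes by normalized dot product, best first
        order = sorted(
            range(len(clusters)),
            key=lambda i: sum(p * q for p, q in zip(xn, unit(clusters[i]["centroid"]))),
            reverse=True,
        )
        # Search: accept the first candidate that passes the ratio test
        chosen = next((i for i in order if distance_ratio(x, clusters[i]) < tau), None)
        if chosen is None:                          # Step 4: create a new cluster
            clusters.append({"samples": [list(x)], "mu": [1.0], "centroid": list(x)})
            labels.append(len(clusters) - 1)
            continue
        c = clusters[chosen]                        # Step 5: fuzzy update
        c["mu"].append(membership(x, chosen, clusters, m))
        c["samples"].append(list(x))
        w = [u ** m for u in c["mu"]]
        c["centroid"] = [sum(wi * s[k] for wi, s in zip(w, c["samples"])) / sum(w)
                         for k in range(len(x))]    # equation (1)
        labels.append(chosen)
    return labels, clusters


# Made-up toy data: two compact groups of four points each
data = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0),
        (8.0, 1.0), (8.2, 0.9), (7.9, 1.1), (8.1, 1.0)]
labels, clusters = aflc(data, tau=3.0)
print(labels)
```

On this toy input the fifth point fails the ratio test against the first cluster and starts a second one, so the two groups end up in separate clusters. The result depends on presentation order, as expected for a leader-type algorithm.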


III. OPERATIONAL CHARACTERISTICS OF AFLC 
A. Match-based Learning and the Search 

In match-based learning, a new input is learned only after being classified as belonging to a 
particular class. This process ensures stable and consistent learning of new inputs by updating parameters 
only for the winning cluster and only after classification has occurred. This differs from error-based 
learning schemes, such as backpropagation of error, where new inputs are effectively averaged with old 
learning resulting in forgetting and possibly oscillatory weight changes. In [5] match-based learning is 
referred to as resonance, hence the name Adaptive Resonance Theory. 

Because of its ART-like control structure, AFLC is capable of implementing a parallel search when 
the distance ratio does not satisfy the thresholding criterion. The search is arbitrated by appropriate 
control logic surrounding the comparison and recognition layers of Figure 1. This type of search is 
necessary due to the incompleteness of the classification at the first stage. For illustration, consider the 
two vectors (1,1) and (5,5). Both possess the same unit vector. Since the competition in the bottom-up 
direction consists of measuring how well the normalized input matches the weight vector for each class i, 
these inputs would both excite the same activation pattern in the recognition layer. In operation, the 
comparison layer serves to test the hypothesis returned by the competition performed at the recognition 
layer. If the hypothesis is disconfirmed by the comparison layer, i.e., R > τ, then the search phase
continues until the correct cluster is found or another cluster is created. Normalization of the input
vectors (features) is done only in the recognition layer for finding the winning node. This normalization
is essential to avoid large values of the dot products of the input features and the bottom-up weights and
also to avoid initial misclassification arising from large variations in the magnitudes of the cluster
prototypes. The search process, however, renormalizes only the centroid and not the input vectors again.
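
The (1,1) and (5,5) illustration can be checked numerically. The sketch below assumes a single prototype at (1,1) whose bottom-up weight is its unit vector; the helper names are hypothetical.

```python
import math


def unit(v):
    """Scale v to unit length (v is assumed not to be the zero vector)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def euclid(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


prototype = [1.0, 1.0]
weight = unit(prototype)

# Recognition layer: both inputs produce the same activation (up to rounding)
act_small = dot(unit([1.0, 1.0]), weight)
act_large = dot(unit([5.0, 5.0]), weight)

# Comparison layer: the Euclidean distance separates them
d_small = euclid([1.0, 1.0], prototype)
d_large = euclid([5.0, 5.0], prototype)
print(act_small, act_large, d_small, d_large)
```

Because the dot product of normalized vectors measures direction only, the distance test of the comparison layer is what separates the two inputs.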

B. Determining the Number of Output Classes 

AFLC utilizes a dynamic, self-organizing structure to learn the characteristics of the input data. As a 
result, it is not necessary to know the number of clusters a priori; new clusters are added to the system as 
needed. This characteristic is necessary for autonomous behavior in practical situations in which 
nonlinearities and nonstationarity are found. 

Clusters are formed and trained, on-line, according to the search and learning algorithms. Several 
factors affect the number, size, shape, and location of the clusters formed in the feature space. Although 
it is not necessary to know the number of clusters which actually exist in the data, the number of clusters 
formed will depend upon the value of τ. A low threshold value will result in the formation of more
clusters because it will be more difficult for an input to meet the classification criteria. A high value of τ
will result in fewer, less dense clusters. For data structures having overlapping clusters, the choice of τ is
critical for correct classification, whereas for nonoverlapping cluster data, the sensitivity to τ is not a
significant issue. In the latter case the value of τ may vary over a certain range while still yielding correct
classification. Therefore the sensitivity to τ is highly dependent on the specific data structure, as shown in
Figure 1(c). The relationship between τ and the optimal number of clusters in a data set is currently
being studied.

C. Dynamic Cluster Sizing 

As described earlier, τ is compared to a ratio of vector norms. The average distance parameter for a
cluster is recalculated after the addition of a new input to that cluster; therefore, this ratio (R) represents a
dynamic description of the cluster. If the inputs are dense around the cluster prototype, then the size of
the cluster will decrease, resulting in a more stringent condition for membership of future inputs to that 
class. If the inputs are widely grouped around the cluster prototype, then this will result in less stringent 
conditions for membership. Therefore, the AFLC clusters have a self-scaling factor which tends to keep 
dense clusters dense while allowing loose clusters to exist. 
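
The self-scaling effect can be illustrated with two made-up clusters that share a centroid but differ in spread. As an assumption, the candidate is counted in the average distance so that the ratio is always defined; the function names are hypothetical.

```python
import math


def euclid(x, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, v)))


def distance_ratio(x, centroid, members):
    """Candidate distance divided by the average distance (candidate included)."""
    d_x = euclid(x, centroid)
    dists = [euclid(s, centroid) for s in members] + [d_x]
    avg = sum(dists) / len(dists)
    return d_x / avg if avg > 0.0 else 0.0


center = [1.0, 0.0]
dense = [[0.9, 0.0], [1.1, 0.0]]   # members close to the prototype
loose = [[0.0, 0.0], [2.0, 0.0]]   # members spread around the prototype
candidate = [3.0, 0.0]

r_dense = distance_ratio(candidate, center, dense)
r_loose = distance_ratio(candidate, center, loose)
print(r_dense, r_loose)
```

The same candidate yields a larger ratio against the dense cluster than against the loose one, so a threshold of 2 would reject it from the former and accept it into the latter.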

D. The Fuzzy Learning Rule 

In general, the AFLC architecture allows learning of even rare events. Use of the fuzzy learning rule
in the form of equations (1) and (2) maintains this characteristic. In weighted rapid learning[5], the
learning time is much shorter than the entire processing time and the adaptive weights are allowed to
reach equilibrium on each presentation of an input, but the amount of change in the prototype is a
function of the input and its fuzzy membership value (μ_ij). Noisy features which would normally
degrade the validity of the class prototype are assigned low weights to reduce the undesired effect. In the
presence of class outliers, assigning low memberships to the outliers leads to correct classification.
Normalization of membership is not involved in this process. However, a new cluster consisting only of
outliers can be formed during the search process[22]. Development of such an outlier/noise cluster in
AFLC is currently in progress.

Weighted rapid learning also tends to reinforce the decision to append a new cluster. This is due to 
the fact that, by definition, the first input to be assigned to a node serves as that node's first prototype;
therefore, that sample has a membership value of one. Future inputs are then weighted by how well they
match the prototype. Although the prototype does change over time, as described in the algorithm, each 
sample retains its weight which tends to limit moves away from the current prototype. Thus the clusters 
possess a type of inertia which tends to stabilize the system by making it more difficult for a cluster to 
radically change its prototype in the feature space.

Finally, the fuzzy learning rule is stable in the sense that the adaptive weights represent a normalized
version of the cluster centroid, or prototype. As such, these weights are bounded on [0,1] and are 
guaranteed not to approach infinity. 

E. AFLC as a General Architecture 

As with most other clustering algorithms, the size and shape of the resultant clusters depends on the 
metric used. The use of any metric will tend to influence the data toward a solution which meets the 
criteria for that metric and not necessarily to the best solution for the data. This statement implies that 
some metrics are better for some problems than are others. The use of a Euclidean metric is convenient, 
but displays the immediate problem that it is best suited to simple circular cluster shapes. The use of the 
Mahalanobis distance accounts for some variations in cluster shape, but its non-linearity serves to place
constraints on the stability of its results. Also, as with other metrics, the Euclidean and Mahalanobis 
distance metrics lose meaning in an anisotropic space. 

IV. TESTS AND RESULTS: FEATURE VECTOR CLASSIFICATION 
A. Clustering of the Anderson Iris Data 

The Anderson Iris data set[23] consists of 150 4-dimensional feature vectors. Each pattern
corresponds to characteristics of one flower from one of the species of Iris. Each of the three varieties of
Iris is represented by 50 of the feature vectors. This data set is popular in the literature and gives results by
which AFLC can be compared to similar algorithms.

We made 52 runs of the AFLC algorithm on the Iris data for 13 different values of τ, with 4 runs for
each τ. Figure 1(c) shows the τ-C graph. With the Euclidean distance ratio and τ ranging between 4.5 and
5.5, the sample data was classified into 3 clusters with only 7 misclassifications. The misclassified
samples actually belonged to Iris versicolor, cluster #2, and were misclassified as Iris virginica, cluster
#1. From Figure 1(c) it can be observed that the optimal number of clusters can be determined from the
τ-C graph as the value of C (C ≠ 1) for which dC/dτ = 0 over the maximum possible range of τ.
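
Reading the optimal C off a τ-C graph amounts to finding the widest flat stretch of the curve, excluding C = 1. The sketch below shows one way to do this. The sweep values are invented for illustration and are not the measurements behind Figure 1(c); the function name is hypothetical.

```python
def widest_plateau(tau_c):
    """Return (width, C, tau_low, tau_high) for the widest run of constant C != 1.

    tau_c is a list of (tau, C) pairs sorted by increasing tau.
    """
    best = None
    start = 0
    for end in range(1, len(tau_c) + 1):
        # A run ends at the end of the list or where C changes
        if end == len(tau_c) or tau_c[end][1] != tau_c[start][1]:
            c = tau_c[start][1]
            lo, hi = tau_c[start][0], tau_c[end - 1][0]
            if c != 1 and (best is None or hi - lo > best[0]):
                best = (hi - lo, c, lo, hi)
            start = end
    return best


# Invented threshold sweep: (tau, number of clusters formed)
sweep = [(1.0, 8), (1.5, 5), (2.0, 4), (2.5, 4), (3.0, 3), (3.5, 3),
         (4.0, 3), (4.5, 2), (5.0, 1), (5.5, 1), (6.0, 1), (6.5, 1)]
result = widest_plateau(sweep)
print(result)
```

The trivial solution C = 1 is skipped even though it spans the widest range of τ in this made-up sweep, which matches the C ≠ 1 condition stated above.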

Figure 3 shows the input Iris data clusters using only three features for each sample data point.
Figure 4a shows the computed centroids of the three clusters based on all four features. The intercluster
Euclidean distances are found to be 1.75 (d_12), 4.93 (d_23), and 3.29 (d_13), where d_ij is the intercluster
distance between clusters i and j. The comparatively smaller intercluster distance between clusters 1 and 2
indicates the proximity of these clusters. Figure 4b shows a confusion matrix that summarizes the
classification results.

Figure 3 

Figure 4 

B. Classification of Noisy Laser-luminescent Fingerprint Image Data 

Fingerprint matching poses a challenging clustering problem. Recent developments in automated 
fingerprint identification systems employ primitive and computationally intensive matching techniques 
such as counting ridges between minutiae of the fingerprints[24]. Although the technique of laser
luminescent image acquisition of latent fingerprints often provides identifiable images[25], these images
suffer from amplified noise, poor contrast and nonuniform intensity. Conventional enhancement 
techniques such as adaptive binarization and wedge filtering provide enhancement at the expense of 
significant loss of information necessary for matching. Recent work[26] presents a novel three stage 
matching algorithm for fingerprint enhancement and matching. Figure 5b shows the enhanced image of 
5a subsequent to selective Fourier spectral enhancement and bandpass filtering. We used the AFLC 
algorithm to cluster three different classes of fingerprint images using seven invariant moment 
features[26],[27] computed from images that are enhanced[26]. A total of 24 data samples are used, each
sample being a 7-dimensional moment feature vector. These moment invariants are a set of nonlinear
functions which are invariant to translation, scale, and rotation. The three higher order moment features
are given less weight, thus reducing the effect of noise and leading to proper classification. The τ-C
graph for the fingerprint data in Figure 1(c) shows a range of τ from 3.0 to 4.5 for which proper
classification resulted. The fingerprint data has also been correctly classified by a k-nearest neighbor
clustering using only four moment features[26]. Euclidean distances of these clusters indicate that the
clusters are well separated, which is consistent with the comparatively larger range of τ found for proper
classification. Figures 5a and 5b represent one fingerprint class before and after enhancement. Figure 6a
shows the computed centroids of three fingerprint clusters. Figure 6b shows a confusion matrix that
indicates correct classification results.

Figure 5, Figure 6 

V. CONCLUSION 

It is possible to apply many of the concepts of AFLC operation to other control structures. Other 
approaches to Fuzzy ART are being explored[20],[21] that could also be used as the control structure for 
a fuzzy learning rule. Choices also exist in the selection of class prototypes. With some modification, 
any of these techniques can be incorporated into a single AFLC system or a hierarchical group of 
systems. The characteristics of that system will depend upon the choices made. 

While AFLC does not solve all the problems associated with unsupervised learning, it does possess a
number of desirable characteristics. The AFLC architecture learns and adapts on-line, such that it is not
necessary to have a priori knowledge of all data samples or even of the number of clusters present in the
data. However, the choice of τ is critical and requires some a priori knowledge of the compactness and
separation of clusters in the data structure. Learning is match-based, ensuring stable, consistent learning of
new inputs. The output is a crisp classification and a degree of confidence for that classification.
Operation is also very fast, and can be made faster through parallel implementation. A recent work[28]
shows a different approach to neural-fuzzy clustering by integrating the Fuzzy c-Means model with
Kohonen neural networks. A comparative study of these recently developed neural-fuzzy clustering
algorithms is needed. Future work will involve further modification of the AFLC system and algorithm
for analyzing simulation data of the TSS system[29] and for automated attitude controller design of the
on-orbit shuttle[30].

FIGURE CAPTIONS


Figure 1. Operation characteristics of AFLC Architecture. 1(a) shows the initial stage of identifying
a cluster prototype, 1(b) shows the comparison stage using the criterion of Euclidean distance
ratio R > τ to reject new data samples to the cluster prototype. The reset control implies the
deactivation of the original prototype and activation of a new cluster prototype, and 1(c)
shows the τ-C graph for choosing τ for unlabelled datasets.

Figure 2. Flow-chart of the AFLC Algorithm 

Figure 3. Iris Data Represented by Three-Dimensional Features 

Figure 4a. Computed Centroids of Three Iris Clusters Based on All Four Feature Vectors 

Figure 4b. Iris Cluster Classification Results shown as a confusion matrix 

Figure 5a. A Noisy Laser-luminescent Fingerprint Image 

Figure 5b. The Enhanced Image of 5a. by Selective Fourier Spectral Filtering 

Figure 6a. Computed Centroids of Three Fingerprint Clusters in Seven-Dimensional Vector Space 

Figure 6b. Fingerprint Data Classification Results 